Skip to content

WIP | Add input and output versioning schema docs#225

Open
nikhilNava wants to merge 1 commit intomainfrom
nikhilc/input-output-message-schema
Open

WIP | Add input and output versioning schema docs#225
nikhilNava wants to merge 1 commit intomainfrom
nikhilc/input-output-message-schema

Conversation

@nikhilNava
Copy link
Copy Markdown
Contributor

No description provided.

@nikhilNava nikhilNava requested a review from a team as a code owner March 25, 2026 15:18
Copilot AI review requested due to automatic review settings March 25, 2026 15:18
@nikhilNava nikhilNava changed the title Add input and output versioning schema docs WIP | Add input and output versioning schema docs Mar 25, 2026
@github-actions
Copy link
Copy Markdown

⚠️ Deprecation Warning: The deny-licenses option is deprecated for possible removal in the next major release. For more information, see issue 997.

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 93b0e97.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds versioned documentation (JSON Schemas + reference docs/examples) for the gen_ai.input.messages and gen_ai.output.messages span attributes in the Observability Runtime docs.

Changes:

  • Added JSON Schema definitions for A365 input and output message formats (based on OTel GenAI semconv v1.40.0).
  • Added schema versioning/compatibility documentation and a changelog.
  • Added example payloads for input/output message attributes.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

File Description
src/Observability/Runtime/docs/schemas/a365-output-messages.json Defines the output messages JSON schema (roles, parts, finish reasons).
src/Observability/Runtime/docs/schemas/a365-input-messages.json Defines the input messages JSON schema (roles, parts).
src/Observability/Runtime/docs/schemas/SCHEMA-VERSION.md Documents current schema version, OTel baseline/commit, and applicability.
src/Observability/Runtime/docs/schemas/EXAMPLES.md Provides concrete JSON examples for input/output message attributes.

"type": { "const": "uri" },
"mime_type": { "anyOf": [{ "type": "string" }, { "type": "null" }], "default": null },
"modality": { "anyOf": [{ "$ref": "#/$defs/Modality" }, { "type": "string" }] },
"uri": { "type": "string", "description": "URI referencing attached data." }
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The uri field is intended to be a URI, but it’s currently only typed as string. Adding format: "uri" (or uri-reference) would make the schema validate examples/values more effectively.

Suggested change
"uri": { "type": "string", "description": "URI referencing attached data." }
"uri": { "type": "string", "format": "uri", "description": "URI referencing attached data." }

Copilot uses AI. Check for mistakes.
"type": { "const": "blob" },
"mime_type": { "anyOf": [{ "type": "string" }, { "type": "null" }], "default": null, "description": "IANA MIME type." },
"modality": { "anyOf": [{ "$ref": "#/$defs/Modality" }, { "type": "string" }], "description": "General modality of the data." },
"content": { "type": "string", "format": "binary", "description": "Base64-encoded binary data." }
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BlobPart.content is documented as base64-encoded data, but the schema uses format: "binary", which is not a portable way to express base64 in JSON Schema. Consider using contentEncoding: "base64" (and optionally contentMediaType) so validators can reliably enforce the encoding.

Suggested change
"content": { "type": "string", "format": "binary", "description": "Base64-encoded binary data." }
"content": { "type": "string", "contentEncoding": "base64", "description": "Base64-encoded binary data." }

Copilot uses AI. Check for mistakes.
"type": { "const": "uri" },
"mime_type": { "anyOf": [{ "type": "string" }, { "type": "null" }], "default": null },
"modality": { "anyOf": [{ "$ref": "#/$defs/Modality" }, { "type": "string" }] },
"uri": { "type": "string", "description": "URI referencing attached data." }
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The uri field is intended to be a URI, but it’s currently only typed as string. Adding format: "uri" (or uri-reference) would make the schema validate examples/values more effectively.

Suggested change
"uri": { "type": "string", "description": "URI referencing attached data." }
"uri": { "type": "string", "format": "uri", "description": "URI referencing attached data." }

Copilot uses AI. Check for mistakes.
{ "type": "text", "content": "Summarize the attached document, check the weather in Seattle, and describe the whiteboard photo." },
{
"type": "file",
"modality": "image",
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example sets modality to image while mime_type is application/pdf. That combination is confusing for readers and doesn’t match the modality meaning in the schema description; consider changing modality to a more appropriate value (or update the example to use an actual image MIME type).

Suggested change
"modality": "image",
"modality": "document",

Copilot uses AI. Check for mistakes.
"type": "blob",
"modality": "audio",
"mime_type": "audio/wav",
"content": "/9j/4AAQSkZJRg..."
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The blob example declares mime_type: audio/wav and modality: audio, but the base64 prefix /9j/ is commonly associated with JPEG data. Consider updating either the MIME/modality or the sample base64 so the example is internally consistent.

Suggested change
"content": "/9j/4AAQSkZJRg..."
"content": "UklGRiQAAABXQVZFZm10IBAAAAABAAEAIlYAAESsAAACABAAZGF0YQAA..."

Copilot uses AI. Check for mistakes.
Comment on lines +161 to +163
### Complex — All Output Part Types

A response demonstrating every output part type: reasoning, text, tool_call, server_tool_call with response, and a custom generic part. Also shows multiple finish reasons.
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section claims the JSON demonstrates “every output part type” and includes “a custom generic part”, but the example shown does not include tool_call_response, file, or any custom GenericPart instance. Either adjust the claim/header, or extend the example to actually include the missing part types.

Suggested change
### Complex — All Output Part Types
A response demonstrating every output part type: reasoning, text, tool_call, server_tool_call with response, and a custom generic part. Also shows multiple finish reasons.
### Complex — Multiple Output Part Types
A response demonstrating several output part types: reasoning, text, server_tool_call with response, and image content as both blob and URI.

Copilot uses AI. Check for mistakes.
Comment on lines +36 to +41
| A365 Scope | Uses This Schema? | Notes |
|---|---|---|
| `InvokeAgentScope` | ✅ | Full conversation context |
| `InferenceScope` | ✅ | Chat history + model response |
| `OutputScope` | ✅ | Outgoing messages to user |
| `ExecuteToolScope` | ❌ | Uses `gen_ai.tool.*` attributes instead |
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The schema describes gen_ai.input.messages as a JSON array of structured message objects, but the current runtime scopes (e.g., InferenceScope/InvokeAgentScope/OutputScope) set this tag to a comma-separated string via string.Join. Either update this doc to clarify it’s a JSON-encoded value produced by specific instrumentations (and not by the built-in scopes today), or update the scopes/tests to emit JSON conforming to this schema.

Copilot uses AI. Check for mistakes.
"type": { "const": "blob" },
"mime_type": { "anyOf": [{ "type": "string" }, { "type": "null" }], "default": null, "description": "IANA MIME type." },
"modality": { "anyOf": [{ "$ref": "#/$defs/Modality" }, { "type": "string" }], "description": "General modality of the data." },
"content": { "type": "string", "format": "binary", "description": "Base64-encoded binary data." }
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BlobPart.content is documented as base64-encoded data, but the schema uses format: "binary", which is not a portable way to express base64 in JSON Schema. Consider using contentEncoding: "base64" (and optionally contentMediaType) so validators can reliably enforce the encoding.

Suggested change
"content": { "type": "string", "format": "binary", "description": "Base64-encoded binary data." }
"content": { "type": "string", "contentEncoding": "base64", "description": "Base64-encoded binary data." }

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants